Rank | Count | Beginning |
---|---|---|
764 | 22604 | A |
27048 | 8201 | Az |
50895 | 1821 | Ez |
81064 | 1596 | Nem |
40354 | 1346 | De |
57576 | 1241 | Ha |
49226 | 1081 | És |
43034 | 989 | Egy |
53829 | 622 | Ezt |
47419 | 588 | Ennek |
35274 | 525 | Azt |
62524 | 522 | Így |
88290 | 486 | S |
73399 | 433 | Még |
74943 | 402 | Mert |
95575 | 400 | Úgy |
42216 | 391 | E |
52038 | 380 | Ezek |
47056 | 376 | Én |
60814 | 376 | Hogy |
64317 | 367 | Itt |
39411 | 342 | Csak |
42246 | 338 | Ebben |
53002 | 320 | Ezért |
79018 | 306 | Most |
75469 | 301 | Mi |
35317 | 295 | Aztán |
76595 | 291 | Minden |
95599 | 282 | Ugyanakkor |
72262 | 279 | Már |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
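For readers without database access, the same count can be approximated in plain Python. This is only a sketch, not the pipeline used to produce the table. It assumes the sentences are available as a list of strings; the function name `sentence_beginnings`, its parameters, and the sample sentences are hypothetical. The parameter `n` plays the role of N, so the same function covers N = 1, 2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of n words.

    Mirrors substring_index(sentence, ' ', n): the text before the n-th
    space is kept, or the whole sentence if it has fewer than n spaces.
    """
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces only, so rejoining reproduces the prefix exactly.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Corresponds to: ORDER BY cnt DESC LIMIT 50
    return counts.most_common(limit)


# Hypothetical sample data, for illustration only.
sample = ["A ház nagy.", "A kutya fut.", "Az alma piros.", "Nem tudom."]
for beginning, count in sentence_beginnings(sample, n=1):
    print(beginning, count)
```

Like the SQL query, the sketch is case-sensitive and treats only the space character as a word boundary, so punctuation attached to the first word stays part of the beginning.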
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV